Log-Rank Test
The log-rank test can be performed on individual-level data or on data that have been summarized
into a life-table format. In this section, we first describe how to run a log-rank test with statistical
software, which is how it is usually done. Then, to help you understand the underlying calculations, we
walk through the log-rank test in detail using a life table, as you might carry it out manually in
spreadsheet software such as Microsoft Excel.
Understanding what the log-rank test is doing
A two-group log-rank test asks whether events (deaths, in our example) are split
between the two groups in the same proportion as the number of at-risk individuals in the two groups.
The computer selects one of the groups and, for each time slice, calculates the difference between the
observed and expected number of deaths in that group. It then sums these differences over all the time
slices to get the total excess deaths for that group. The excess deaths sum is then scaled down, meaning
it is divided by an estimate of its standard deviation. (Later in this chapter we describe how to
calculate that standard deviation estimate.) If the null hypothesis is true, the scaled-down excess
deaths sum is a number whose random sampling fluctuations should approximately follow a standard normal
distribution, so a p value can be easily calculated from it. The null hypothesis of the log-rank test
is that there is no difference in survival between the two groups, so a p value less than your selected α
(usually 0.05) indicates a statistically significant difference.
Don’t worry if the preceding paragraph makes your head spin. It is only meant to give you a general
sense of the calculations behind the log-rank test.
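For readers who prefer symbols, the description above can be sketched as a formula. The notation here is an assumption introduced only for illustration: in time slice j, n_j individuals are at risk in total, n_{1j} and n_{2j} of them are in the selected group and the other group, d_j deaths occur in total, and O_{1j} of those deaths occur in the selected group. The variance term V_j shown is the commonly used hypergeometric form, which may differ in detail from the standard deviation estimate described later in this chapter.

```latex
E_{1j} = d_j \,\frac{n_{1j}}{n_j},
\qquad
V_j = \frac{d_j \, n_{1j} \, n_{2j} \, (n_j - d_j)}{n_j^{2}\,(n_j - 1)},
\qquad
Z = \frac{\sum_j \left( O_{1j} - E_{1j} \right)}{\sqrt{\sum_j V_j}}
```

Here the numerator of Z is the total excess deaths for the selected group, and the denominator is the estimate of its standard deviation.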
Running the log-rank test with software
Most commercial statistical software packages (like those described in Chapter 4) can
perform a log-rank test. You first organize your data into a table that has one row per individual,
and these three columns:
Group: The group variable contains a code indicating the individual’s group. In this example, we
could use the code Drug = 1 and Control = 2.
Time: A numerical variable containing the individual’s survival time. For individuals
experiencing the event during the study, it represents the time to the event. For censored individuals,
it is the time to the end of observation.
Event status: A variable that indicates the individual’s status at the end of observation. If the
individual experienced the event, it is usually coded as 1. If the individual was censored (that is, did
not experience the event while under observation), it is coded as 0.
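As a rough illustration of this layout, here is a minimal Python sketch of a table with one row per individual and the three columns described above. The records are made-up values used only to show the structure, and the names records and check_record are hypothetical, not part of any statistical package.

```python
# Made-up records purely to illustrate the layout (not real study data).
# group: 1 = Drug, 2 = Control; time: survival time; event: 1 = event, 0 = censored.
records = [
    {"group": 1, "time": 6.0, "event": 1},
    {"group": 1, "time": 12.5, "event": 0},  # censored at end of observation
    {"group": 2, "time": 3.0, "event": 1},
    {"group": 2, "time": 9.0, "event": 1},
]


def check_record(rec):
    """Return True if one row follows the coding conventions described above."""
    return (
        rec["group"] in (1, 2)
        and rec["time"] >= 0
        and rec["event"] in (0, 1)
    )


# Many programs ask for the three variables as separate columns.
groups = [r["group"] for r in records]
times = [r["time"] for r in records]
events = [r["event"] for r in records]

print(all(check_record(r) for r in records))
```

Checking the coding before running the test can help catch problems such as an event status entered as something other than 0 or 1.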
To run the log-rank test, you tell your computer program which variable represents the group,
which one contains the survival time, and which one contains the event status. The program should
produce a p value for the log-rank test. If you set α = 0.05 and the p value is less than that, you
reject the null hypothesis and conclude that the two groups have statistically significantly different
survival curves.
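To make the inputs and the output concrete, the following is a minimal Python sketch of a two-group log-rank test that takes the three variables and returns a p value. It is an illustration under stated assumptions, not what any particular software package does internally: the function name logrank_two_group is hypothetical, the data are made up, the variance uses the common hypergeometric form, and the p value is two-sided and based on the normal approximation. Many programs report a chi-square statistic instead, which is the square of the z value here and gives the same p value.

```python
import math


def logrank_two_group(groups, times, events, reference_group):
    """Return (z, two-sided p) for a two-group log-rank test.

    groups, times, events are equal-length columns with one entry per individual;
    events uses 1 = event and 0 = censored.
    """
    if len(set(groups)) != 2:
        raise ValueError("exactly two groups are required")
    rows = list(zip(groups, times, events))
    event_times = sorted({t for _, t, e in rows if e == 1})

    excess = 0.0    # sum of (observed - expected) events in the reference group
    variance = 0.0  # estimated variance of that sum
    for t in event_times:
        n = sum(1 for _, ti, _ in rows if ti >= t)  # at risk, both groups
        n1 = sum(1 for g, ti, _ in rows if ti >= t and g == reference_group)
        d = sum(1 for _, ti, e in rows if ti == t and e == 1)  # events at t
        d1 = sum(
            1 for g, ti, e in rows
            if ti == t and e == 1 and g == reference_group
        )
        excess += d1 - d * n1 / n
        if n > 1:  # with one individual at risk, the variance term is zero
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)

    if variance == 0:
        raise ValueError("variance is zero; the test cannot be computed")
    z = excess / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p value
    return z, p


# Made-up example data: group 1 = Drug, group 2 = Control.
groups = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2]
times = [6, 7, 10, 15, 19, 1, 2, 3, 4, 5]
events = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]

alpha = 0.05
z, p = logrank_two_group(groups, times, events, reference_group=1)
print(f"z = {z:.3f}, p = {p:.4f}")
print("reject the null hypothesis" if p < alpha else "do not reject the null hypothesis")
```

Choosing the other group as the reference flips the sign of z but leaves the p value unchanged, which is why it does not matter which group the program selects.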
In addition to the p value, the program may output the median survival time for each group along with
confidence intervals, and the difference in median times between groups. If possible, you will also want